Amendments to the Claims ; 

This listing of claims wiH replace all prior versions, and listings, of claims in the 
application: 

1. (currently amended) A method for identifying a macromolecule having a 
sequence and sequence modifications thereof from mass spectrometry data 
comprising: 

(a) providing at least one de novo sequence from mass spectrometry data of 
sequences of fragments of said macromolecule to a computer-based system that 
implements the identification of sequences of molecules and sequence modifications 
from mass spectrometry data , 

(b) calculating at least one mass-based alignment between said at least one de 
novo sequence and sequences in a sequence database, wherein the molecular masses 
of said fragments in a de novo sequence are compared to molecular masses of 
fragments in each sequence in said sequence database, 

(c) interpreting mass differences between a sequence in the sequence database 
and the at least one de novo sequence using a modification catalog, said mass 
differences having been identified within said mass-based alignment, 

wherein said modification catalog contains information for accurately identifying 
sequence variations and post-translational protein modifications, 

(d) calculating at least one match score for the mass-based alignment that 
provides an indication of matching between a sequence in the sequence database and 
the at least one de novo sequence, 

(e) identifying sequences in the sequence database from mass-based 
alignments in response to the match scores, 

(f) grouping identifications of sequences in the sequence database from at least 
one de novo sequence into an identified macromolecule list that agrees with the de 
novo sequencing results, and 

(g) storing said macromolecule list on a computer readable medium. 

2. (previously presented) The method of claim 1, wherein the mass 
spectrometry data are generated from a tandem mass spectrometer device. 
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3. (previously presented) The method of claim 1 , wherein at least one de 
novo sequence is an estimated sequence of fragments generated from the mass 
spectrometry data. 

4. (previously presented) The method of claim 3, wherein a cfe novo 
sequence is a complete or partial sequence of fragments. 

5. (previously presented) The method of claim 3, wherein a de novo 
sequence contains ambiguous mass regions of molecules where the exact sequence of 
fragments cannot be determined. 

6. (previously presented) The method of claim 5, wherein a mass region is 
the molecular mass of the fragments in an unidentifiable region of molecules. 

7. (previously presented) The method of claim 1 , wherein at least one 
fragment is an amino acid and at least one sequence of fragments is a peptide. 

8. (original) The method of claim 7, wherein the peptides are derived by an 
enzymatic digestion of proteins. 

9. (original) The method of claim 7, wherein the sequence database is a 
database of amino acid sequences of proteins. 

10. (original) The method of claim 7, wherein the sequence database is a 
database of amino acid sequences derived from nucleotide sequences. 

1 1 . (original) The method of claim 7, wherein the sequence database is a 
database of de novo peptide sequences. 

12. (previously presented) The method of claim 7, wherein the sequence in 
the sequence database is a particular amino acid sequence. 
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13. (previously presented) The method of claim 6, further comprising: 

(h) identifying a sequence in the sequence database with a tag match, and 

(i) generating a mass-based alignment between a de novo sequence and the 
sequence in the sequence database. 

14. (original) The method of claim 13, wherein a mass-based alignment is a 
series of consecutive local mass-based alignments on either side of a tag match. 

15. (original) The method of claim 14, wherein a tag match is when a tag in 
the de novo sequence has been shown to be equivalent to a tag in a sequence in the 
sequence database by way of a tag search. 

16. (original) The method of claim 15, wherein a tag search is used to identify 
a subset of sequences in the sequence database from which to compute mass-based 
alignments. 

17. (previously presented) The method of claim 16, wherein a tag is a 
sequence of consecutive fragments of a specified length, wherein the specified length is 
2 to 4 fragments. 

18. (previously presented) The method of claim 16, wherein single fragments 
of the tag and sequences in the sequence database that have the same nominal weight 
are represented by a single fragment. 

19. (previously presented) The method of claim 14, wherein fragments at 
either side of the tag match in both the de novo sequence and the sequence of the 
sequence database are converted into mass objects. 

20. (original) The method of claim 19, wherein a mass object is at least one 
molecular mass and a name for that mass. 

21 . (previously presented) The method of claim 18, wherein for single 
fragments, mass objects are assigned the molecular mass of the single fragment. 
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22. (original) The method of claim 18, wherein for unidentifiable mass 
regions, mass objects are assigned the molecular mass of the unidentifiable mass 
region. 

23. (original) The method of claim 18, wherein for reference amino acids, 
which represent multiple amino acids, mass objects are assigned the molecular mass of 
each amino acid. 

24. (original) The method of claim 19, wherein for variably modified amino 
acids, mass objects are assigned multiple molecular masses. 

25. (previously presented) The method of claim 19, wherein mass regions are 
treated as single fragments with a single molecular mass. 

26. (previously presented) The method of claim 19, wherein a gap is a mass 
object of zero molecular mass that represents no fragment. 

27. (original) The method of claim 19, wherein a local mass-based alignment 
is a matching of at least one consecutive mass object in the sequence in the sequence 
database and at least one consecutive mass object in a de novo sequence. 

28. (previously presented) The method of claim 27, wherein each local mass- 
based alignment is generated with a breadth-first search, wherein all possible 
sequential combinations of mass objects of a specified number of mass objects are 

compared. 

29. (previously presented) The method of claim 28, wherein said specified 
number of mass objects used in the breadth-first search is a search depth. 

30. (original) The method of claim 29, wherein the search depth is 3-5. 

31 . (previously presented) The method of claim 21 , wherein a breadth-first 
search is used identify the local mass-based alignment as either a mass match, a 
substitution, or a gap match. 
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32. (previously presented) The method of claim 31 , wherein said breadth-first 
search first tries to identify a mass match, as a local mass-based alignment where the 
sum of the molecular masses of the consecutive mass objects in the sequence in the 
sequence database and the sum of the molecular masses of the consecutive mass 
objects in a de novo sequence are equal within a specified mass tolerance. 

33. (original) The method of claim 31, wherein if there are no mass objects 
left on the side of the tag match in the sequence in the sequence database, a gap 
match is identified as a local mass-based alignment between a gap and at least one 
consecutive mass object in either the sequence in the sequence database or the cfe 
novo sequence. 

34. (original) The method of claim 31 , wherein if a mass match or a gap 
cannot be identified, then the breadth first search identifies a modification site as a local 
mass-based alignment where the sum of the molecular masses of the consecutive 
mass objects in the sequence in the sequence database and the sum of the molecular 
masses of the consecutive mass objects in a de novo sequence are not equal within a 
specified mass tolerance. 

35. (original) The method of claim 31, wherein the number of mass objects in 
the de novo sequence and the number of mass objects in the sequence database is 
minimized. 

36. (original) The method of claim 31, wherein the specified mass tolerance is 
designated by a mass tolerance of a tandem mass spectrometer device that generates 
the mass spectrometry data. 

37. (previously presented) The method of claim 28, wherein after all 
fragments are matched in the breadth-first search in each respective sequence, a new 
local mass-based alignment is generated starting from next fragment following the last 
fragment that was matched in the cfe novo sequence and next fragment following the 
last fragment that was matched in the sequence in the sequence database. 
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38. (original) The method of claim 37, wherein a series of local mass-based 
alignments are made until the entire de novo sequence has been accounted for by the 
sequence in the sequence database in the mass-based alignments. 

39. (previously presented) The method of claim 38, wherein a maximum 
number of consecutive modification sites are identified. 

40. (previously presented) The method of claim 39, wherein the maximum 
number of consecutive modification sites is 1 or 2 local mass-based alignments in 
length. 

41. (previously presented) The method of claim 39, wherein modification 
information is cataloged in a modification catalog. 

42. (previously presented) The method of claim 41 , wherein said modification 
information includes at least one of, molecular mass of the modification, specific 
fragments where the modification occurs, a frequency of occurrence of the modification 
at those fragments, wherein the frequency of occurrence is the estimated frequency in 
nature or a frequency as a sample preparation artifact, a mass object for the 
modification, which represents the additional mass of the modification to the de novo 
sequence at those fragments, the name of the modification, and a modification score for 
the modification. 

43. (previousFy presented) The method of claim 42, wherein said modification 
is selected from, an in vivo or in vitro protein, a peptide modification, and an amino acid 
substitution. 

44. (previously presented) The method of claim 43, further comprising: 
ranking the modification, wherein the ranking is based on its frequency of occurrence. 

45. (previously presented) The method of claim 44, further comprising: 
identifying a most probable modification in the modification site from the modification 
catalog by matching elements to elements in modifications in the modification catalog 
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that are selected from at least one of, the mass difference, the fragments in the 
sequence database in the modification site, and the ranking of the modification in the 
modifications catalog. 

46. (original) The method of claim 45, wherein the mass difference is the 
difference between the sum of the molecular masses of the consecutive mass objects in 
the sequence in the sequence database and the sum of the molecular masses of the 
consecutive mass objects in a de novo sequence in a local mass-based alignment. 

47. (previously presented) The method of claim 45, wherein the mass object 
of an identified modification is inserted into the mass-based alignment, which creates a 
mass match between the de novo sequence and the sequence in the sequence 
database. 

48. (previously presented) The method of claim 38, further comprising: 
computing a match score of each mass-based alignment, wherein the match score 
being a measure of how well the sequence in the sequence database matches the de 
novo sequence. 

49. (previously presented) The method of claim 48, wherein said match score 
is generated from the linear combination of local alignment scores from the series of 
local mass-based alignments. 

50. (original) The method of claim 49, wherein each of a series of consecutive 
local mass-based alignments receives a score and is classified, 

51. (original) The method of claim 50, wherein each local alignment score is 
generated using a substitution matrix, depending on whether the local alignment is a 
mass match, a modification site, or a gap match. 

52. (previously presented) The method of claim 51 , wherein the substitution 
matrix contains a substitution matrix score of at least one fragment. 
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53. (previously presented) The method of claim 52, wherein the substitution 
matrix identity score is a substitution matrix score between a fragment and itself. 

54. (previously presented) The method of claim 53, wherein the substitution 
matrix substitution score is a substitution matrix score between a fragment and a 
different fragment. 

55. (previously presented) The method of claim 54, wherein the substitution 
matrix score is the log of the odds score of an identity of a fragment or a substitution 
between two fragments. 

56. (previously presented) The method of claim 52, wherein the local 
alignment score for a mass match is the average value of the substitution matrix identity 
scores for all of the fragments in the sequence in the sequence database matched in 
the local alignment. 

57. (previously presented) The method of claim 56, wherein if at least one of 
the fragments has been modified by a modification, the substitution matrix score for 
each modified fragment is the modification score for that modification. 

58. (previously presented) The method of claim 52, wherein if the local mass- 
based alignment is a match between only one mass object from the sequence in the 
sequence database, and only one mass object from the de novo sequence, and tbat 
those mass objects represent single fragment, then the local alignment score for a 
substitution is the substitution matrix substitution score between the fragment in the 
sequence in the sequence database and the fragment in the de novo sequence. 

59. (previously presented) The method of claim 52, wherein the local 
alignment score for a substitution is the number of fragments in the substitution in the 
sequence in the sequence database multiplied by the average value of the substitution 
matrix substitution scores. 
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60. (previously presented) The method of claim 52, wherein the local 
alignment score for a gap match is the number of fragments in the gap match in the cfe 
novo sequence multiplied by the average value of the substitution matrix substitution 
scores. 

61 . (previously presented) The method of claim 48, wherein if the termini of 
the de novo sequence are expected to be specific fragments, the match score is 
increased if the termini of the mass-based alignment match the expected specific 
fragments. 

62. (previously presented) The method of claim 48, wherein if the termini of 
the de novo sequence are expected to be specific fragments, the match score is 
decreased if the termini of the mass-based alignment do not match the expected 
specific fragments, or if expected specific fragments are present inside the mass-based 
alignment. 

63. (previously presented) The method of claim 1 , further comprising utilizing 
an approach that interprets matches between sequences in the sequence database and 
said de novo sequence, which have been scored by a match score, as an identified 
macromolecule list and assigns a macromolecule score to each sequence in said 
identified macromolecule list. 

.... _ 64. (previously presented). The. method. of.claim53 r _wherein. the. match_score_ 

is a measure of how well said sequences in the sequence database match the de novo 
sequence. 

65. (previously presented) The method of claim 64, wherein a de novo 
sequence that matches at least one sequence in the sequence database is given a 
classification and inserted into a de novo sequence list, and the de novo sequence in 
the de novo sequence list is ranked by its delta score, which is the difference between 
the scores of the first and second best alignments for a given mass spectrum. 
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66. (original) The method of claim 65, wherein the delta score is computed for 
the de novo sequence as the difference between the match scores of the first and 
second matches to sequences in the sequence database for that de novo sequence. If 
that de novo sequence only matches one sequence in the sequence database, the delta 
score is the match score for that match. 

67. (previously presented) The method of claim 66, wherein said 
classification is either a discriminating de novo sequence or a non-discriminating de 
novo sequence, wherein a discriminating de novo sequence is one that has a delta 
score greater than or equal to a delta score threshold and a non-discriminating de novo 
sequence is one that has a delta score less than a delta score threshold. 

68. (original) The method of claim 67, wherein the delta score threshold for 
the de novo sequence is between 0% and 25% of the match score of the highest 
scoring match between a sequence in the sequence database and that de novo 
sequence. 

69. (previously presented) The method of claim 67, wherein all matches 
between a sequence in the sequence database and a de novo sequence with match 
scores less than the match score of the highest scoring match between a sequence in 
the sequence database and that de novo sequence minus the delta score threshold are 
discarded. 

70. (previously presented) The method of claim 69, wherein the sequence in 
the sequence database that matches best to the discriminating de novo sequence in 
said ofe novo sequence list with the greatest delta score is added to the identified 
macromolecule list, whereupon said de novo sequence is then moved from the de novo 
sequence list. 

71 . (original) The method of claim 70, wherein all non-discriminating de novo 
sequences in the de novo sequence list that match to that sequence in the identified 
macromolecule list are moved from the de novo sequence list to that sequence. 



3588777_1.DOC 



11 



Response to Office Action 
Application No. 10/789,424 
Attorney Docket No. PTX-0003 



72. (previously presented) The method of claim 70, repeated until all 
discriminating de novo sequences in said de novo sequence list are removed from the 
de novo sequence list. 

73. (original) The method of claim 72, wherein all sequences in the sequence 
database that match to non-discriminating de novo sequences still in the de novo 
sequence list are added to the identified macromolecule list, and the non-discriminating 
de novo sequences still in the de novo sequence list are moved to those sequences. 

74. (original) The method of claim 73, wherein a macromolecule score is 
generated for every sequence in the identified macromolecule list. 

75. (original) The method of claim 74, wherein the macromolecule score is a 
linear combination of the de novo macromolecule scores of the de novo sequences that 
have been assigned to that sequence. 

76. (previously presented) The method of claim 64, wherein a new sequence 
database is generated and stored on a computer readable medium, containing only the 
sequences in the sequence database that are listed in the identified macromolecule list. 

77. (previously presented) The method of claim 76, wherein de novo 
sequences that do not match any sequence in the original sequence database are re- 
analyzed by calculating a mass-based alignment between each de novo sequence in 
question and sequences in the new sequence database, in a way that the search space 
explored by the mass-based alignment algorithm is increased. 

78. (original) The method of claim 77, further comprising: decreasing the 
specified length of tags. 

79. (original) The method of claim 77, further comprising: increasing the 
search depth. 

80. (original) The method of claim 77, further comprising: increasing the 
maximum number of consecutive substitutions. 
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81. (previously presented) The method of claim 64, wherein de novo 
sequences that do not match any sequence in the original sequence database are re- 
analyzed by calculating a mass-based alignment between each de novo sequence in 
question and sequences in a different sequence database. 

82. (currently amended) A method for identifying sequences of a 
macromolecule having a sequence and sequence modifications thereof from mass 
spectrometry data comprising: 

(a) providing at least one de novo sequence from mass spectrometry data of 
sequences of fragments of said macromolecule to a computer-based system that 
implements the identification of sequences of molecules and sequence modifications 
from mass spectrometry data , 

(b) calculating at least one mass-based alignment between each de novo 
sequence and sequences in a sequence database, wherein the molecular masses of 
said fragments in a de novo sequence are compared to molecular masses of fragments 
in each sequence in the sequence database, 

(c) interpreting mass differences between a sequence in the sequence database 
and the at least one cfe novo sequence using a modification catalog, said mass 
differences having been identified within said mass-based alignment, 

wherein said modification catalog contains information for accurately identifying 
sequence variations and post-translational protein modifications, 

- — (d)- calculating at least one match score for the mass^based alignment that - 

provides an indication of matching between the sequence in the sequence database 
and the at least one de novo sequence, and 

(e) storing said score on a computer readable medium. 

83. (original) The method of claim 82, further comprising: identifying 
sequences in the sequence database from mass-based alignments in response to the 
match scores. 
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84. (original) The method of claim 83, further comprising: grouping 
identifications of sequences in the sequence database from at least one de novo 
sequence into an identified macromolecule list that agrees with the de novo sequencing 
results. 
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